stats: auto-populate node.id and node.cluster in OTel stats sink resource #44066
Retr0-XD wants to merge 4 commits into envoyproxy:main
Hi @Retr0-XD, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already.
…urce

The OpenTelemetry stats sink previously emitted metrics with an empty Resource (or only telemetry.sdk.* attributes). Downstream consumers, such as OTel Collector processors, had no way to identify which Envoy instance emitted the metrics.

This change automatically injects `node.id` and `node.cluster` from `LocalInfo` into the resource attributes of every export request. Resource detectors configured by the user retain priority: if a detector already sets those keys, the auto-populated values are skipped (try_emplace semantics).

Fixes: envoyproxy#39931

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>
59ecd90 to 6b784d5
I am not sure, but I think we now have resource detectors for this, like https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/tracers/opentelemetry/resource_detectors/v3/static_config_resource_detector.proto#extension-envoy-tracers-opentelemetry-resource-detectors-static-config or https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/tracers/opentelemetry/resource_detectors/v3/environment_resource_detector.proto#extension-envoy-tracers-opentelemetry-resource-detectors-environment.

cc @kyessenov because Kuat may have more context about this PR.
@wbpcode can I get a review on this please :)
I don't think we should be using To set per-envoy attributes, you can already do it as follows:
I'm putting a block on this because I think it would be a breaking change. The server side in OTLP uses attribute presence for mapping and can be tripped up by a new default attribute.
Changed test expectations to use standard OTel semantic conventions:
- node.id → service.instance.id
- node.cluster → service.namespace

This aligns tests with the switch from custom attributes to standard OTel conventions for Envoy instance identification.

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>

Changed implementation to use standard OpenTelemetry semantic conventions instead of custom node attributes:
- node.id → service.instance.id (identifies the Envoy instance)
- node.cluster → service.namespace (provides cluster/namespace context)

This aligns with OTel standards and reduces risk of breaking changes for server-side OTLP attribute processing logic.

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>

Updated test expectations to use standard OTel semantic conventions:
- node.id → service.instance.id
- node.cluster → service.namespace

Aligns with the switch from custom attributes to standard OTel conventions for Envoy instance identification.

Signed-off-by: Retr0-XD <sakthi.harish@edgeverve.com>
Problem

The OpenTelemetry stats sink emits metrics with an empty `Resource` (only `telemetry.sdk.*` attributes). Consumers such as OTel Collector processors have no way to identify which Envoy instance sent the metrics, which makes correlating configs with metrics impossible.

Reported in #39931.

Solution

Automatically inject two resource attributes from `LocalInfo` when building the sink:

| Attribute | Source | Example value |
| --- | --- | --- |
| `node.id` | `LocalInfo::nodeName()` | `"envoy-pod-1"` |
| `node.cluster` | `LocalInfo::clusterName()` | `"my-service"` |

Priority rule: configured resource detectors take precedence. `try_emplace` ensures detector-set values are never overwritten.

Files changed

- `source/extensions/stat_sinks/open_telemetry/config.cc`: inject `node.id`/`node.cluster` before constructing `OtlpOptions`
- `test/extensions/stats_sinks/open_telemetry/config_test.cc`: factory smoke test with non-empty node info, plus a detector-priority test
- `test/extensions/stats_sinks/open_telemetry/open_telemetry_impl_test.cc`: end-to-end flush test verifying attributes appear in `ResourceMetrics`

Example config

No config change required. With the existing bootstrap:

The exported `ResourceMetrics.resource.attributes` will now include:

AI disclosure: GitHub Copilot was used during implementation and test writing. I fully understand all changes made in this PR.
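For reviewers who want to see the priority rule from the Solution section in isolation, here is a minimal sketch. It is not the code in this PR: `ResourceAttributes` and `addNodeAttributes` are hypothetical names, and a plain `std::map` of strings stands in for whatever attribute container the sink actually uses. It only illustrates that `try_emplace` leaves a key untouched when a detector has already set it.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the sink's resource attribute container.
// The real Envoy type may differ; only the try_emplace behavior matters here.
using ResourceAttributes = std::map<std::string, std::string>;

// Adds node.id and node.cluster to attrs unless those keys already exist.
// node_name and cluster_name play the role of LocalInfo::nodeName() and
// LocalInfo::clusterName() from the PR description.
void addNodeAttributes(ResourceAttributes& attrs, const std::string& node_name,
                       const std::string& cluster_name) {
  // try_emplace inserts only when the key is absent, so a value written
  // earlier by a resource detector is kept as-is.
  attrs.try_emplace("node.id", node_name);
  attrs.try_emplace("node.cluster", cluster_name);
}
```

With this sketch, if `attrs` already holds `node.id` from a detector, calling `addNodeAttributes(attrs, "envoy-pod-1", "my-service")` keeps the detector's `node.id` and only adds `node.cluster`.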
Commit Message: See PR title
Risk Level: Low
Testing: Unit tests added/verified
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: N/A